AITopics | regular language

Country:

Europe > Germany > Saarland (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(5 more...)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing SystemsFeb-10-2026, 17:44:38 GMT

b04c387c8384ca083a71b8da516f65f6-Supplemental.pdf

concept class, probability, rnn, (16 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > India (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(9 more...)

Genre:

Workflow (0.67)
Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial IntelligenceOct-13-2025

A Constructive Framework for Nondeterministic Automata via Time-Shared, Depth-Unrolled Feedforward Networks

Dhayalkar, Sahil Rajesh

We present a formal and constructive simulation framework for nondeterministic finite automata (NFAs) using time-shared, depth-unrolled feedforward networks (TS-FFNs), i.e., acyclic unrolled computations with shared parameters that are functionally equivalent to unrolled recurrent or state-space models. Unlike prior approaches that rely on explicit recurrent architectures or post hoc extraction methods, our formulation symbolically encodes automaton states as binary vectors, transitions as sparse matrix transformations, and nondeterministic branching-including $\varepsilon$-closures-as compositions of shared thresholded updates. We prove that every regular language can be recognized exactly by such a shared-parameter unrolled feedforward network, with parameter count independent of input length. Our construction yields a constructive equivalence between NFAs and neural networks and demonstrates \emph{empirical learnability}: these networks can be trained via gradient descent on supervised acceptance data to recover the target automaton behavior. This learnability, formalized in Proposition 5.1, is the crux of this work. Extensive experiments validate the theoretical results, achieving perfect or near-perfect agreement on acceptance, state propagation, and closure dynamics. This work clarifies the correspondence between automata theory and modern neural architectures, showing that unrolled feedforward networks can perform precise, interpretable, and trainable symbolic computation.

artificial intelligence, computation, machine learning, (17 more...)

2505.2411

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing SystemsOct-10-2025, 01:15:08 GMT

The Expressive Capacity of State Space Models: A Formal Language Perspective

Recently, recurrent models based on linear state space models (SSMs) have shown promising performance in language modeling (LM), competititve with transformers. However, there is little understanding of the in-principle abilities of such models, which could provide useful guidance to the search for better LM architectures. We present a comprehensive theoretical study of the capacity of such SSMs as it compares to that of transformers and traditional RNNs. We find that SSMs and transformers have overlapping but distinct strengths. In star-free state tracking, SSMs implement length-generalizing solutions to problems that transformers struggle to represent exactly. They can also model bounded hierarchical structure with optimal memory even without simulating a stack. On the other hand, we identify a design choice in current SSMs that limits their expressive power.

finite precision, ssm, transformer, (15 more...)

Country:

Europe > Germany > Saarland (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(5 more...)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceOct-7-2025

On the Hardness of Learning Regular Expressions

Attias, Idan, Reyzin, Lev, Srebro, Nathan, Vardi, Gal

Despite the theoretical significance and wide practical use of regular expressions, the computational complexity of learning them has been largely unexplored. We study the computational hardness of improperly learning regular expressions in the PAC model and with membership queries. We show that PAC learning is hard even under the uniform distribution on the hypercube, and also prove hardness of distribution-free learning with membership queries. Furthermore, if regular expressions are extended with complement or intersection, we establish hardness of learning with membership queries even under the uniform distribution. We emphasize that these results do not follow from existing hardness results for learning DFAs or NFAs, since the descriptive complexity of regular languages can differ exponentially between DFAs, NFAs, and regular expressions.

artificial intelligence, machine learning, regular expression, (14 more...)

2510.04834

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.48)

Neural Information Processing SystemsOct-3-2025, 04:14:16 GMT

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Other comments will be taken into account in the revision. Reviewer_45: Thank you for the positive feedback and constructive comments.

algorithm, principal shuffle ideal, shuffle ideal, (12 more...)

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Hayashi, Katsuhiko, Kamigaito, Hidetaka

From Formal Language Theory to Statistical Learning: Finite Observability of Subregular Languages

arXiv.org Artificial IntelligenceSep-29-2025

We prove that all standard subregular language classes are linearly separable when represented by their deciding predicates. This establishes finite observability and guarantees learnability with simple linear models. Synthetic experiments confirm perfect separability under noise-free conditions, while real-data experiments on English morphology show that learned features align with well-known linguistic constraints. These results demonstrate that the subregular hierarchy provides a rigorous and interpretable foundation for modeling natural language structure. Our code used in real-data experiments is available at https://github.com/UTokyo-HayashiLab/subregular.

artificial intelligence, machine learning, predicate, (15 more...)

2509.22598

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.60)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Mündler, Niels, Dekoninck, Jasper, Vechev, Martin

Constrained Decoding of Diffusion LLMs with Context-Free Grammars

arXiv.org Artificial IntelligenceAug-18-2025

Large language models (LLMs) have shown promising performance across diverse domains. Many practical applications of LLMs, such as code completion and structured data extraction, require adherence to syntactic constraints specified by a formal language. Yet, due to their probabilistic nature, LLM output is not guaranteed to adhere to such formal languages. Prior work has proposed constrained decoding as a means to restrict LLM generation to particular formal languages. However, existing works are not applicable to the emerging paradigm of diffusion LLMs, when used in practical scenarios such as the generation of formally correct C++ or JSON output. In this paper we address this challenge and present the first constrained decoding method for diffusion models, one that can handle formal languages captured by context-free grammars. We begin by reducing constrained decoding to the more general additive infilling problem, which asks whether a partial output can be completed to a valid word in the target language. This problem also naturally subsumes the previously unaddressed multi-region infilling constrained decoding. We then reduce this problem to the task of deciding whether the intersection of the target language and a regular language is empty and present an efficient algorithm to solve it for context-free languages. Empirical results on various applications, such as C++ code infilling and structured data extraction in JSON, demonstrate that our method achieves near-perfect syntactic correctness while consistently preserving or improving functional correctness. Importantly, our efficiency optimizations ensure that the computational overhead remains practical.

completion, large language model, machine learning, (20 more...)

2508.10111

Country:

Europe > Switzerland > Zürich > Zürich (0.86)
Europe > United Kingdom (0.14)

Genre: Research Report (0.64)

Industry: Materials > Chemicals > Commodity Chemicals (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Neural Information Processing SystemsAug-16-2025, 21:20:22 GMT

Learning and Generalization in RNNs

This fixed function is modeled by a neural networks with one hidden-layer.

artificial intelligence, machine learning, natural language, (20 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > India (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(9 more...)

Genre:

Workflow (0.67)
Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial IntelligenceMar-18-2025

RWKV-7 "Goose" with Expressive Dynamic State Evolution

Peng, Bo, Zhang, Ruichong, Goldstein, Daniel, Alcaide, Eric, Hou, Haowen, Lu, Janna, Merrill, William, Song, Guangyu, Tan, Kaifeng, Utpala, Saiteja, Wilce, Nathan, Wind, Johan S., Wu, Tianyi, Wuttke, Daniel, Zhou-Zheng, Christian

We present RWKV-7 "Goose", a new sequence modeling architecture, along with pre-trained language models that establish a new state-of-the-art in downstream performance at the 3 billion parameter scale on multilingual tasks, and match current SoTA English language performance despite being trained on dramatically fewer tokens than other top 3B models. Nevertheless, RWKV-7 models require only constant memory usage and constant inference time per token. RWKV-7 introduces a newly generalized formulation of the delta rule with vector-valued gating and in-context learning rates, as well as a relaxed value replacement rule. We show that RWKV-7 can perform state tracking and recognize all regular languages, while retaining parallelizability of training. This exceeds the capabilities of Transformers under standard complexity conjectures, which are limited to $\mathsf{TC}^0$. To demonstrate RWKV-7's language modeling capability, we also present an extended open source 3.1 trillion token multilingual corpus, and train four RWKV-7 models ranging from 0.19 billion to 2.9 billion parameters on this dataset. To foster openness, reproduction, and adoption, we release our models and dataset component listing at https://huggingface.co/RWKV, and our training and inference code at https://github.com/RWKV/RWKV-LM all under the Apache 2.0 License.

large language model, machine learning, natural language, (22 more...)